Shortest path distance in random k-nearest neighbor graphs
Abstract
Consider a weighted or unweighted k-nearest neighbor graph that has been built on n data points drawn randomly according to some density p on R^d. We study the convergence of the shortest path distance in such graphs as the sample size tends to infinity. We prove that for unweighted kNN graphs, this distance converges to an unpleasant distance function on the underlying space whose properties are detrimental to machine learning. We also study the behavior of the shortest path distance in weighted kNN graphs.
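To make the two distances in the abstract concrete, here is a minimal sketch rather than the construction used in the paper. It assumes points sampled uniformly from the unit square, a symmetric kNN rule (two points are joined if either is among the other's k nearest), and Euclidean edge lengths as weights. The function names `knn_graph`, `hop_distances`, and `weighted_distances` are hypothetical.

```python
import heapq
import math
import random
from collections import deque


def knn_graph(points, k):
    """Symmetric kNN graph (assumed rule): join i and j if either
    is among the other's k nearest neighbors. Edge weight = Euclidean length."""
    n = len(points)
    adj = [dict() for _ in range(n)]
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            adj[i][j] = d
            adj[j][i] = d
    return adj


def hop_distances(adj, source):
    """Unweighted shortest path distance (number of edges), via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def weighted_distances(adj, source):
    """Weighted shortest path distance, via Dijkstra's algorithm."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Assumed toy setting: 200 points, uniform density on the unit square, k = 8.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]
adj = knn_graph(points, k=8)
hops = hop_distances(adj, 0)
weighted = weighted_distances(adj, 0)
```

The hop count ignores edge lengths entirely, so it depends on the sample only through the graph's connectivity. The weighted variant sums Euclidean edge lengths along the path and is therefore never smaller than the straight-line distance between the endpoints.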
Similar resources
Nearest-neighbor Queries in Probabilistic Graphs
Large probabilistic graphs arise in various domains spanning from social networks to biological and communication networks. An important query in these graphs is the k-nearest-neighbor query, which involves finding and reporting the k closest nodes to a specific node. This query assumes the existence of a measure of the “proximity” or the “distance” between any two nodes in the graph. To that en...
Density estimation from unweighted k-nearest neighbor graphs: a roadmap
Consider an unweighted k-nearest neighbor graph on n points that have been sampled i.i.d. from some unknown density p on R^d. We prove how one can estimate the density p just from the unweighted adjacency matrix of the graph, without knowing the points themselves or any distance or similarity scores. The key insights are that local differences in link numbers can be used to estimate a local funct...
Phase transition in the family of p-resistances
We study the family of p-resistances on graphs for p ≥ 1. This family generalizes the standard resistance distance. We prove that for any fixed graph, for p = 1 the p-resistance coincides with the shortest path distance, for p = 2 it coincides with the standard resistance distance, and for p → ∞ it converges to the inverse of the minimal s-t-cut in the graph. Secondly, we consider the special c...
k-Nearest Neighbors in Uncertain Graphs
Complex networks, such as biological, social, and communication networks, often entail uncertainty, and thus, can be modeled as probabilistic graphs. Similar to the problem of similarity search in standard graphs, a fundamental problem for probabilistic graphs is to efficiently answer k-nearest neighbor queries (k-NN), which is the problem of computing the k closest nodes to some specific node....
Semi-supervised Learning with Density Based Distances
We present a simple yet effective approach to Semi-Supervised Learning. Our approach is based on estimating density-based distances (DBD) using a shortest path calculation on a graph. These Graph-DBD estimates can then be used in any distance-based supervised learning method, such as Nearest Neighbor methods and SVMs with RBF kernels. In order to apply the method to very large data sets, we al...
Publication date: 2012